- North America > United States (0.28)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain > Basque Country (0.04)
- Workflow (0.49)
- Research Report (0.46)
- Leisure & Entertainment (0.46)
- Government > Regional Government (0.46)
Estimator
Observations $o = \delta_x$ are sampled with uniform distribution on $x$ U[ 1,3] (shown in blue). $\hat{f}_\lambda$ is calculated 500 times for different realizations of the training data (10 example predictors are shown in dashed lines); its mean and 2 standard deviations are shown in red. The true function $f(x) = x^2 + 2\cos(4x)$ is shown in black. Preliminary: Big-P notation. Throughout our proofs, we will frequently rely on a polynomial analogue of the big-O notation, which we call big-P: Definition 1. Let us observe that all the quantities we study (the predictor, the risk and empirical risk) stay the same if any observation oi is replaced by oi. The existence and the uniqueness of the solution in the cone spanned by 1 and 1/z of the equation can be argued as follows.
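The experiment described in the caption above (refit a ridge predictor on many fresh training sets, then report the mean and a 2 standard deviation band) can be sketched as follows. This is a minimal illustration, not the authors' code: the Gaussian kernel, its length scale, the ridge value `lam`, the training-set size, the sampling interval, and the `K + lam * I` regularization convention are all assumptions made for the sketch.

```python
import numpy as np

def true_f(x):
    # True function quoted in the caption: f(x) = x^2 + 2 cos(4x).
    return x**2 + 2.0 * np.cos(4.0 * x)

def rbf_kernel(a, b, length_scale=0.5):
    # Gaussian kernel between two 1-D point sets (kernel choice is an assumption).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def krr_predict(x_train, y_train, x_test, lam, length_scale=0.5):
    # Kernel ridge regression: solve (K + lam * I) alpha = y, then predict.
    # The scaling of lam relative to the sample size is an assumption.
    n = len(x_train)
    K = rbf_kernel(x_train, x_train, length_scale)
    alpha = np.linalg.solve(K + lam * np.eye(n), y_train)
    return rbf_kernel(x_test, x_train, length_scale) @ alpha

def predictor_band(n_train=20, n_rep=500, lam=1e-3, low=1.0, high=3.0, seed=0):
    # Refit the predictor on n_rep independent training sets and summarize it.
    # low/high are placeholder interval endpoints (assumption).
    rng = np.random.default_rng(seed)
    x_test = np.linspace(low, high, 50)
    preds = np.empty((n_rep, x_test.size))
    for r in range(n_rep):
        x_train = rng.uniform(low, high, size=n_train)
        preds[r] = krr_predict(x_train, true_f(x_train), x_test, lam)
    # Pointwise mean and half-width of the 2-standard-deviation band.
    return x_test, preds.mean(axis=0), 2.0 * preds.std(axis=0)
```

Calling `predictor_band()` returns the test grid, the pointwise mean predictor, and the half-width of the band; plotting `mean - band` and `mean + band` against `true_f` gives a picture of the kind the caption describes.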
- Europe > Austria > Vienna (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
The Cost of Robustness: Tighter Bounds on Parameter Complexity for Robust Memorization in ReLU Nets
Kim, Yujun, Moon, Chaewon, Yun, Chulhee
We study the parameter complexity of robust memorization for $\mathrm{ReLU}$ networks: the number of parameters required to interpolate any given dataset with $ε$-separation between differently labeled points, while ensuring predictions remain consistent within a $μ$-ball around each training sample. We establish upper and lower bounds on the parameter count as a function of the robustness ratio $ρ= μ/ ε$. Unlike prior work, we provide a fine-grained analysis across the entire range $ρ\in (0,1)$ and obtain tighter upper and lower bounds that improve upon existing results. Our findings reveal that the parameter complexity of robust memorization matches that of non-robust memorization when $ρ$ is small, but grows with increasing $ρ$.
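To make the robustness ratio $ρ = μ/ε$ concrete, the sketch below computes it for a small labeled dataset. It assumes that $ε$ is half the minimum Euclidean distance between differently labeled points, a convention under which $μ$-balls around differently labeled points are disjoint whenever $ρ < 1$; the paper's exact definition may differ, and the function names are hypothetical.

```python
import numpy as np

def separation(X, y):
    # Assumed definition: half the smallest Euclidean distance between
    # two points that carry different labels.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    different = y[:, None] != y[None, :]
    if not different.any():
        raise ValueError("need at least two distinct labels")
    return 0.5 * dists[different].min()

def robustness_ratio(X, y, mu):
    # rho = mu / epsilon, with epsilon computed as above.
    return mu / separation(X, y)

# Toy dataset: the closest differently labeled pair is (0,0) and (2,0).
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
y = [0, 1, 0]
eps = separation(X, y)              # 1.0
rho = robustness_ratio(X, y, 0.25)  # 0.25
```

Under this convention, a robustness radius of 0.25 on the toy dataset sits in the small-$ρ$ regime the abstract refers to.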
- North America > United States (0.28)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain > Basque Country (0.04)
- Leisure & Entertainment (0.46)
- Education (0.46)
- Government > Regional Government (0.46)
In Appendix A we provide heuristic justification for the scaling of the optimal error rate
In Appendix D we provide the proofs for Theorem 7. In Appendix E we include some useful results for the sake of completeness. Informally, we expect that there is one sign flip (i.e., The top left, top right and bottom left figures show the scaling of the minimax rates of GLM (cf. To begin with the analysis of the estimator in Figure 2, the following lemma is a simple, yet key tool for the proof. It establishes the variance of the random gain S. The proof relies on a sort of self-bounding property (cf.
Supplementary Material for Fast Vision Transformers with HiLo Attention
Department of Data Science & AI, Monash University, Australia. We organize our supplementary material as follows. In Section A, we describe the architecture specifications of LITv2. In Section B, we provide the derivation for the computational cost of HiLo attention. In Section C, we study the effect of window size based on CIFAR-100. In Section F, we provide more visualisation examples for spectrum analysis of HiLo attention. We use "ConvFFN Block" to differentiate our "ConvFFN" denotes our modified FFN layer where we adopt one layer of The overall framework of LITv2 is depicted in Figure I.
Capturing a Moving Target by Two Robots in the F2F Model
Jawhar, Khaled, Kranakis, Evangelos
We study a search problem on capturing a moving target on an infinite real line. Two autonomous mobile robots (which can move with a maximum speed of 1) are initially placed at the origin, while an oblivious moving target is initially placed at a distance $d$ away from the origin. The robots can move along the line in any direction, but the target is oblivious, cannot change direction, and moves either away from or toward the origin at a constant speed $v$. Our aim is to design efficient algorithms for the two robots to capture the target. The target is captured only when both robots are co-located with it. The robots communicate with each other only face-to-face (F2F), meaning they can exchange information only when co-located, while the target remains oblivious and has no communication capabilities. We design algorithms under various knowledge scenarios, which take into account the prior knowledge the robots have about the starting distance $d$, the direction of movement (either toward or away from the origin), and the speed $v$ of the target. As a measure of the efficiency of the algorithms, we use the competitive ratio, which is the ratio of the capture time of an algorithm with limited knowledge to the capture time in the full-knowledge model. In our analysis, we are mindful of the cost of changing direction of movement, and show how to accomplish the capture of the target with at most three direction changes (turns).
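The competitive ratio defined above can be illustrated with a small calculation. The sketch assumes that in the full-knowledge model the robots know the target's side of the origin, its distance $d$, its speed $v$, and its direction, so both robots simply travel together at speed 1 straight toward it; the closing speed is then $1 - v$ when the target moves away and $1 + v$ when it moves toward the origin. The function names are hypothetical and this baseline is an assumption, not a statement of the paper's algorithms.

```python
import math

def full_knowledge_capture_time(d, v, moving_away):
    # Both robots start at the origin and move together at speed 1
    # toward the target (assumed full-knowledge strategy).
    if d <= 0 or v < 0:
        raise ValueError("d must be positive and v non-negative")
    if moving_away:
        # A target at least as fast as the robots is never caught.
        if v >= 1:
            return math.inf
        return d / (1.0 - v)
    return d / (1.0 + v)

def competitive_ratio(algorithm_time, d, v, moving_away):
    # Capture time under limited knowledge divided by the
    # full-knowledge capture time.
    return algorithm_time / full_knowledge_capture_time(d, v, moving_away)

# Target at distance 6 with speed 0.5.
t_away = full_knowledge_capture_time(6.0, 0.5, moving_away=True)     # 12.0
t_toward = full_knowledge_capture_time(6.0, 0.5, moving_away=False)  # 4.0
ratio = competitive_ratio(24.0, 6.0, 0.5, moving_away=True)          # 2.0
```

An algorithm that needs 24 time units against a receding target at distance 6 and speed 0.5 therefore has competitive ratio 2 relative to this baseline.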
- North America > Canada > Ontario > National Capital Region > Ottawa (0.28)
- North America > United States > District of Columbia > Washington (0.06)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Israel (0.04)